Use `Unaccent` with dandiset search filter #2142

jjnesbitt · 2025-01-17T18:41:43Z

Closes #1941

This change normalizes accented characters when searching dandisets. For example (from that issue), the words Buzsaki and Buzsáki now resolve to the same word in search.
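To illustrate what "resolve to the same word" means, here is a minimal plain-Python sketch of accent stripping. It is only an approximation for illustration: the PR relies on the Postgres `unaccent` extension, which uses its own rules file, so results may differ for some characters. The helper name `strip_accents` is hypothetical and not part of the codebase.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Approximate accent removal (hypothetical helper, not Postgres unaccent)."""
    # NFKD splits "á" into "a" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Both spellings normalize to the same string, so a search for one finds the other.
same = strip_accents("Buzsáki") == strip_accents("Buzsaki")
print(same)
```

The key point is that both the search term and the stored text are normalized before comparison, so either spelling matches the other.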

This change appears to have no real impact on query performance. If anything, performance seems very slightly better, perhaps because the search is now simpler overall (we're no longer using the DRF `SearchFilter` class).

Postgres docs for the unaccent extension: https://www.postgresql.org/docs/current/unaccent.html
Django docs for unaccent: https://docs.djangoproject.com/en/5.1/ref/contrib/postgres/lookups/#unaccent

Of note, we can't simply use the `__unaccent` lookup as shown in the Django docs without other changes, because the `metadata` field is a `JSONField`, and that lookup only works on `CharField` and `TextField`.
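As a rough model of why the JSON has to be treated as text first, the sketch below does the same thing in plain Python: serialize the metadata to a string, strip accents from both sides, then do a case-insensitive substring match. This is an assumption-laden illustration, not the code in this PR, which performs the equivalent work in the database. The names `strip_accents` and `metadata_matches` and the sample metadata are hypothetical.

```python
import json
import unicodedata


def strip_accents(text: str) -> str:
    """Approximate accent removal (hypothetical helper, not Postgres unaccent)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def metadata_matches(metadata: dict, term: str) -> bool:
    """Return True if the term appears anywhere in the serialized metadata."""
    # Accent stripping works on strings, so the JSON document is serialized first.
    # ensure_ascii=False keeps "á" as a real character rather than "\u00e1".
    as_text = json.dumps(metadata, ensure_ascii=False)
    # Normalize both sides, then compare case-insensitively.
    return strip_accents(term).casefold() in strip_accents(as_text).casefold()


# Hypothetical sample metadata, for illustration only.
metadata = {"name": "Example dandiset", "contributor": [{"name": "Buzsáki"}]}
print(metadata_matches(metadata, "Buzsaki"))
```

Normalizing the search term as well as the stored text means an accented query also finds unaccented data, and vice versa.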

kabilar · 2025-01-17T19:18:57Z

Thanks @jjnesbitt. cc @bendichter

jjnesbitt · 2025-01-20T22:12:48Z

Just realized I never wrote any tests for this. I'll do that.

dandiapi/api/views/dandiset.py

bendichter · 2025-01-25T17:58:23Z

Would you mind including a screenshot of a search for "Buzsaki" with this new change?

jjnesbitt · 2025-01-27T20:02:32Z

> Would you mind including a screenshot of a search for "Buzsaki" with this new change?

Sure, here it is. Update: @waxlamp pointed out that there are multiple results for the same dandiset in this image. It seems an entry is being returned for every version of that dandiset. I'll work to fix that now.

Okay, I've addressed that problem. Below is the updated screenshot.

bendichter · 2025-01-27T22:15:44Z

This looks right to me

mvandenburgh

LGTM

dandibot · 2025-01-28T19:46:09Z

🚀 PR was released in v0.4.14 🚀

bendichter · 2025-01-28T19:56:03Z

Thanks for pushing this through, @jjnesbitt !

jjnesbitt requested review from waxlamp and mvandenburgh January 17, 2025 18:41

waxlamp marked this pull request as draft January 21, 2025 23:41

waxlamp reviewed Jan 21, 2025

dandiapi/api/views/dandiset.py (resolved)

waxlamp reviewed Jan 21, 2025

dandiapi/api/views/dandiset.py (outdated, resolved)

Add migration to activate the unaccent extension

145b5d2

jjnesbitt force-pushed the unaccent-search branch from 5bcb413 to a9fc509 Compare January 27, 2025 21:05

jjnesbitt added 2 commits January 27, 2025 15:09

Use Unaccent with dandiset search filter

cab3dbb

Add test for accented character search

989b793

jjnesbitt force-pushed the unaccent-search branch from a9fc509 to 989b793 Compare January 27, 2025 22:09

jjnesbitt marked this pull request as ready for review January 27, 2025 22:13

Fix linting issue

e4013f1

mvandenburgh approved these changes Jan 28, 2025


jjnesbitt added the `patch` (Increment the patch version when merged) and `release` (Create a release when this pr is merged) labels Jan 28, 2025

jjnesbitt merged commit 1290ea6 into master Jan 28, 2025
11 checks passed

jjnesbitt deleted the unaccent-search branch January 28, 2025 19:45

dandibot added the `released` (This issue/pull request has been released.) label Jan 28, 2025
